LCORE-1422: Inline RAG (BYOK) e2e tests by are-ces · Pull Request #1370 · lightspeed-core/lightspeed-stack

are-ces · 2026-03-22T14:40:29Z

Description

Add end-to-end tests for inline RAG (BYOK) to verify that deterministic context injection works correctly through the lightspeed-stack configuration.

Scenarios:

Inline RAG source registration appears in /v1/rags with user-facing rag_id
Query with inline RAG returns relevant content from BYOK vector store
Response includes non-empty rag_chunks from inline context injection (no tool calls)

Files added/changed:

Test configs for both server-mode and library-mode (lightspeed-stack-inline-rag.yaml)
Feature file (inline_rag.feature) with 3 scenarios
New step definition for rag_chunks assertion
Environment hooks for @InlineRAG config switching
FAISS_VECTOR_STORE_ID env var added to lightspeed-stack container in docker-compose

Type of change

End to end tests improvement

Tools used to create PR

Assisted-by: Claude Code
Generated by: Claude Opus 4.6

Related Tickets & Documents

Closes LCORE-1422

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Manually verified inline RAG via curl against running service
uv run behave tests/e2e/features/inline_rag.feature — 3/3 scenarios pass
uv run behave tests/e2e/features/faiss.feature — 4/4 existing scenarios pass (no regression)
uv run make format and uv run make verify — all checks pass
Tested in server mode with Docker compose

Summary by CodeRabbit

Tests
- Added end-to-end Inline RAG tests (registration, query relevance, document retrieval), a new step asserting non-empty rag_chunks, and a unit test ensuring env-var-resolved duplicate vector stores are skipped; test runner toggles inline RAG configs.
Configuration
- Introduced FAISS vector-store env var and multiple inline RAG config files for library/server modes; updated CI/test configs and tool listings.
Chores
- Clarified an inline RAG comment and adjusted test container Python path for local src resolution.

coderabbitai · 2026-03-22T14:40:47Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds inline BYOK RAG (FAISS) support and tests: introduces FAISS env var to Compose files, new inline-FAISS Lightspeed YAMLs for library/server modes, E2E feature and step definitions, test hooks to swap configs, unit test for env-var resolution, and env-var-aware duplicate detection in vector-store construction.

Changes

Cohort / File(s)	Summary
Docker Compose `docker-compose.yaml`, `docker-compose-library.yaml`	Added `FAISS_VECTOR_STORE_ID` env var to `llama-stack` and `lightspeed-stack`; updated inline-RAG comment in `docker-compose-library.yaml`.
E2E Inline RAG Configs `tests/e2e/configuration/library-mode/lightspeed-stack-inline-rag.yaml`, `tests/e2e/configuration/server-mode/lightspeed-stack-inline-rag.yaml`	New Lightspeed configs registering inline BYOK FAISS RAG (`e2e-test-docs`) with embedding settings, `vector_db_id` from `${env.FAISS_VECTOR_STORE_ID}`, KV DB path, and runtime fields.
Existing E2E Config Updates `tests/e2e/configs/run-ci.yaml`, `tests/e2e/configuration//lightspeed-stack.yaml`, `tests/e2e/configuration//lightspeed-stack-auth-noop-token.yaml`	Added/merged BYOK inline FAISS entries and `rag.tool` mappings; removed preconfigured FAISS/vector-store entries from `run-ci.yaml` and shields.
E2E Features & Steps `tests/e2e/features/inline_rag.feature`, `tests/e2e/features/faiss.feature`, `tests/e2e/features/steps/llm_query_response.py`	New InlineRAG feature testing registration, inline queries, and document inclusion; added step asserting non-empty `rag_chunks`; adjusted expected `rags` value in existing FAISS feature.
E2E Test Harness `tests/e2e/features/environment.py`, `tests/e2e/test_list.txt`	Added `InlineRAG` handling to swap configs per mode before/after features, restart `lightspeed-stack`, and added the new feature to the test list.
Configuration Logic `src/llama_stack_configuration.py`	Added `_resolve_env_var(value: str)` and updated duplicate detection to resolve `${env...}` placeholders when comparing existing `vector_store_id` entries against BYOK `vector_db_id`.
Unit Tests `tests/unit/test_llama_stack_configuration.py`	Added test ensuring `construct_vector_stores_section` skips duplicates when existing `vector_store_id` uses `${env.VAR}` that resolves to the same concrete ID.
Build / Containerfile `test.containerfile`	Set `PYTHONPATH=/opt/app-root/src` so runtime imports find `src` without installing the package.

Sequence Diagram(s)

sequenceDiagram
    participant Test as Test Scenario
    participant Env as Test Environment
    participant LCS as Lightspeed Stack
    participant FAISS as FAISS Vector DB
    participant LLM as OpenAI LLM

    Test->>Env: Start `@InlineRAG` feature
    Env->>Env: Backup default lightspeed config
    Env->>Env: Load inline-rag YAML (library/server)
    Env->>LCS: Restart container with inline RAG config

    Test->>LCS: GET /v1/rags (auth)
    LCS->>LCS: List registered RAGs (includes e2e-test-docs)
    LCS-->>Test: 200 + rags: ["e2e-test-docs"]

    Test->>LCS: POST /v1/query (with system_prompt)
    LCS->>FAISS: Search vectors for query
    FAISS-->>LCS: Return matching chunks
    LCS->>LLM: Invoke LLM with retrieved context
    LLM-->>LCS: Generated response
    LCS-->>Test: 200 + response (includes rag_chunks)

    Test->>Env: Finish feature
    Env->>LCS: Restore default config and restart

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly identifies the main change: adding end-to-end tests for inline RAG (BYOK) functionality, which directly matches the primary objective of the pull request.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/e2e/configuration/library-mode/lightspeed-stack.yaml (1)

29-29: Use an isolated E2E DB path to avoid cross-run state leakage.

Line 29 points to a shared home-path DB, which can retain prior test state and weaken deterministic E2E behavior.

♻️ Proposed change

-    db_path: ~/.llama/storage/rag/kv_store.db
+    db_path: /tmp/lightspeed/rag/${env.FAISS_VECTOR_STORE_ID}/kv_store.db

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml` at line 29, The
config uses a shared home-path under the db_path key which can leak state
between runs; update the db_path entry in lightspeed-stack.yaml to point to an
isolated per-run test location (for example a temp directory under the repo like
tests/e2e/tmp/kv_store.db or use an env var such as TEST_DB_PATH that the e2e
runner populates), ensuring the test harness creates/cleans that directory
before/after each run so each E2E run uses a fresh KV store.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml`:
- Line 28: The vector_db_id field currently uses the env placeholder
${env.FAISS_VECTOR_STORE_ID} with no fallback, which can leave the literal
placeholder in config and break FAISS lookups; update the lightspeed-stack
configuration to provide a safe fallback (e.g., use the env fallback syntax like
${env.FAISS_VECTOR_STORE_ID:=<default>} or programmatically validate/abort when
FAISS_VECTOR_STORE_ID is missing) and/or add a clear documentation/validation
step for library-mode entrypoints to require FAISS_VECTOR_STORE_ID; target the
vector_db_id setting and the FAISS_VECTOR_STORE_ID environment usage in the
stack config and any launch validation logic (entrypoint/launcher bootstrap) to
implement this change.

---

Nitpick comments:
In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml`:
- Line 29: The config uses a shared home-path under the db_path key which can
leak state between runs; update the db_path entry in lightspeed-stack.yaml to
point to an isolated per-run test location (for example a temp directory under
the repo like tests/e2e/tmp/kv_store.db or use an env var such as TEST_DB_PATH
that the e2e runner populates), ensuring the test harness creates/cleans that
directory before/after each run so each E2E run uses a fresh KV store.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 08affa37-389a-4c86-8438-a9753c02d591

📥 Commits

Reviewing files that changed from the base of the PR and between 7c999d3 and c3a7172.

📒 Files selected for processing (6)

tests/e2e/configs/run-ci.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yaml
tests/e2e/configuration/library-mode/lightspeed-stack.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-auth-noop-token.yaml
tests/e2e/configuration/server-mode/lightspeed-stack.yaml
tests/e2e/features/faiss.feature

✅ Files skipped from review due to trivial changes (2)

tests/e2e/features/faiss.feature
tests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yaml

tests/e2e/configuration/library-mode/lightspeed-stack.yaml

coderabbitai

♻️ Duplicate comments (1)

tests/e2e/configuration/library-mode/lightspeed-stack.yaml (1)
28-28: ⚠️ Potential issue | 🟠 Major

FAISS_VECTOR_STORE_ID should not rely on implicit launcher wiring.

Line 28 still has no fallback/guard (${env.FAISS_VECTOR_STORE_ID}), so misconfigured launchers can end up with unresolved placeholder behavior at runtime. Please either add a deterministic default via :=... or enforce explicit startup validation for this env var.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml` at line 28, The
YAML uses an unresolved environment placeholder for vector_db_id
(${env.FAISS_VECTOR_STORE_ID}) which can lead to runtime unresolved values;
either provide a deterministic default in the template (use the := form so
vector_db_id: ${env.FAISS_VECTOR_STORE_ID:=your-default-id}) or add explicit
startup validation that checks process.env.FAISS_VECTOR_STORE_ID and fails fast
with a clear error before using vector_db_id (update the launcher/startup
validation logic that reads FAISS_VECTOR_STORE_ID to validate presence and exit
with a helpful message).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml`:
- Line 28: The YAML uses an unresolved environment placeholder for vector_db_id
(${env.FAISS_VECTOR_STORE_ID}) which can lead to runtime unresolved values;
either provide a deterministic default in the template (use the := form so
vector_db_id: ${env.FAISS_VECTOR_STORE_ID:=your-default-id}) or add explicit
startup validation that checks process.env.FAISS_VECTOR_STORE_ID and fails fast
with a clear error before using vector_db_id (update the launcher/startup
validation logic that reads FAISS_VECTOR_STORE_ID to validate presence and exit
with a helpful message).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 93417b3f-b5cb-4a4c-8d6e-0ee974b3bd7d

📥 Commits

Reviewing files that changed from the base of the PR and between c3a7172 and aeb6cc3.

📒 Files selected for processing (7)

tests/e2e/configs/run-ci.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-inline-rag.yaml
tests/e2e/configuration/library-mode/lightspeed-stack.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-auth-noop-token.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-inline-rag.yaml
tests/e2e/configuration/server-mode/lightspeed-stack.yaml

✅ Files skipped from review due to trivial changes (2)

tests/e2e/configuration/library-mode/lightspeed-stack-inline-rag.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-inline-rag.yaml

🚧 Files skipped from review as they are similar to previous changes (4)

tests/e2e/configuration/server-mode/lightspeed-stack.yaml
tests/e2e/configuration/server-mode/lightspeed-stack-auth-noop-token.yaml
tests/e2e/configs/run-ci.yaml
tests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yaml

tisnik

It would be better not to introduce "magic" tags that change feature behaviours. Please use the standard approach - Background of feature or Given steps. TY in advance

are-ces

Removed magic attribute

tisnik · 2026-03-26T12:12:53Z

src/llama_stack_configuration.py

    return output


+def _resolve_env_var(value: str) -> str:


does not resolve_env implement it?

I see, there is a method in the llama-stack core library

addressed in new commit

tisnik · 2026-03-26T12:14:22Z

tests/e2e/features/steps/common.py

+
+    mode_dir = "library-mode" if context.is_library_mode else "server-mode"
+    if is_prow_environment():
+        config_path = f"tests/e2e-prow/rhoai/configs/{config_name}"


@radofuchs does it make sense to setup those paths in Background or Given?

we can set it up partially, but we still need to differentiate between library and server mode in the same test

radofuchs · 2026-03-26T13:21:23Z

tests/e2e/features/steps/common.py

+        config_path = f"tests/e2e/configuration/{mode_dir}/{config_name}"
+    create_config_backup("lightspeed-stack.yaml")
+    switch_config(config_path)
+    restart_container("lightspeed-stack")


for the same reason that is here for creating this step, we should also create a separate step for the lightspeed-stack restart as we are hiding this functionality from the user who would want to reproduce the behavior

addressed in new commit

radofuchs · 2026-03-26T13:30:15Z

tests/e2e/configuration/server-mode/lightspeed-stack-inline-rag.yaml

@@ -0,0 +1,41 @@
+name: Lightspeed Core Service (LCS)


create a copy of these new configs also in e2e-prow

addressed in new commit

radofuchs · 2026-03-26T13:32:28Z

tests/e2e/features/inline_rag.feature

+    """
+    {"query": "What does Paul Graham say about great work?"}
+    """
+    Then The status code of the response is 200


why did you skip the verification of the response content in this case?

Thought separate tests should test different things, but I can update the test to get more consistency in the results

radofuchs · 2026-03-26T13:32:56Z

tests/e2e/features/inline_rag.feature

+  Scenario: Responses API with inline RAG returns relevant content
+    Given The system is in default state
+      And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
+    When I use "responses" to ask question with authorization header


add here also test where stream is set to true to cover all existing cases

addressed in new commit, TY

radofuchs · 2026-03-26T13:34:16Z

tests/e2e/features/inline_rag.feature

+      And The service uses the lightspeed-stack-inline-rag.yaml configuration
+
+  Scenario: Check if inline RAG source is registered
+    Given The system is in default state


first two lines of every test in this feature are identical. You can move them to background to improve readability

addressed in new commit

radofuchs · 2026-03-26T13:37:06Z

tests/e2e/features/steps/common.py

+    restart_container("lightspeed-stack")
+    # Library mode needs extra time to load embedding models after restart
+    wait_for_container_health("lightspeed-stack", max_attempts=12)
+


you never return to the original config file here, set some flag here that will trigger cleanup at the end of the scenario/feature

tisnik

LGTM from my perspective, but @radofuchs concerns should be solved too

Add end-to-end tests for inline RAG with BYOK configuration. Includes test configs for both server and library modes, feature file with 3 scenarios, and environment hooks for config switching. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The enrichment skipped duplicates by comparing vector_store_id values, but existing entries using ${env.VAR} patterns were compared as literal strings against resolved values from Pydantic config, causing false negatives and duplicate registration crashes in library mode. Also switch inline RAG test configs to noop auth so the Docker health check works correctly during container restarts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ation from run-ci Remove vector_stores pre-registration from run-ci.yaml so BYOK RAG is managed entirely through lightspeed-stack configs. Add byok_rag with rag.tool mode to default and auth-noop-token configs for both server and library modes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…nfig from run-ci Use ${env.KV_RAG_PATH:=~/.llama/storage/rag/kv_store.db} for BYOK db_path so tilde is properly resolved inside the container. Remove the FAISS vector_io provider and kv_rag storage backend from run-ci since the BYOK enrichment now creates them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ds modules The llama-stack enrichment entrypoint runs llama_stack_configuration.py which imports constants and log modules from src/. Setting PYTHONPATH ensures these resolve correctly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move inline_rag.feature right after faiss.feature in the test order. MCP tests call clear_llama_stack_storage() which rm -rf ~/.llama inside the container, destroying the bind-mounted kv_store.db needed by RAG tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace the implicit @InlineRAG feature tag (handled in environment.py hooks) with a parameterized Background step "The service uses the {config_name} configuration" that switches config and restarts the container. The step is reusable for any config file and idempotent within a feature run via backup file existence check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The "Inline RAG query includes referenced documents" scenario was only checking rag_chunks and response format without verifying referenced documents. Replace with a dedicated referenced_documents assertion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add streaming Responses API inline RAG e2e scenario (stream: true) - Copy inline RAG config to e2e-prow/rhoai/configs for prow compatibility - Extract restart into explicit "The service is restarted" Gherkin step - Replace custom _resolve_env_var with llama_stack replace_env_vars Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

radofuchs

LGTM

* LCORE-1422: Inline RAG (BYOK) e2e tests - Add 6 e2e scenarios: query, streaming query, responses API, streaming responses API, referenced documents, RAG source registration - Add inline RAG configs for server, library, and prow modes - Add reusable "The service uses the {config} configuration" Background step - Fix BYOK RAG enrichment duplicate detection with env var references - Add after_feature safety net to restore config backups Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Mar 23, 2026

View reviewed changes

tests/e2e/configuration/library-mode/lightspeed-stack.yaml Show resolved Hide resolved

coderabbitai bot reviewed Mar 23, 2026

View reviewed changes

tisnik requested changes Mar 23, 2026

View reviewed changes

are-ces force-pushed the LCORE-1422-inline-rag-e2e-tests branch from e671522 to 94f1c37 Compare March 24, 2026 09:15

are-ces commented Mar 24, 2026

View reviewed changes

are-ces requested review from radofuchs and tisnik March 24, 2026 14:31

are-ces marked this pull request as draft March 25, 2026 12:00

are-ces force-pushed the LCORE-1422-inline-rag-e2e-tests branch 2 times, most recently from a8efb6f to db6d857 Compare March 25, 2026 12:48

are-ces marked this pull request as ready for review March 25, 2026 14:00

are-ces marked this pull request as draft March 26, 2026 09:48

are-ces marked this pull request as ready for review March 26, 2026 11:19

are-ces force-pushed the LCORE-1422-inline-rag-e2e-tests branch from db6d857 to cda64ae Compare March 26, 2026 11:21

tisnik reviewed Mar 26, 2026

View reviewed changes

radofuchs reviewed Mar 26, 2026

View reviewed changes

tisnik approved these changes Mar 27, 2026

View reviewed changes

are-ces and others added 6 commits March 27, 2026 14:39

are-ces and others added 5 commits March 27, 2026 14:39

LCORE-1422: Add streaming and responses API inline RAG e2e tests

a9b4d2a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

LCORE-1422: Fix streaming Responses API inline RAG test parser

fc8b5ec

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

are-ces force-pushed the LCORE-1422-inline-rag-e2e-tests branch from d14567a to fc8b5ec Compare March 27, 2026 14:20

radofuchs approved these changes Mar 27, 2026

View reviewed changes

are-ces merged commit e184541 into lightspeed-core:main Mar 27, 2026
19 of 24 checks passed

Conversation

are-ces commented Mar 22, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

are-ces left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

are-ces Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

radofuchs left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

are-ces commented Mar 22, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 22, 2026 •

edited

Loading

are-ces Mar 26, 2026 •

edited

Loading